class: center, middle, inverse, title-slide

.title[
# Three common mistakes in statistics and how to avoid them
]
.author[
### Elizabeth Pankratz
]
.institute[
### Department of Psychology
The University of Edinburgh
]

---

## Something you won't be able to unsee

--

.pull-left[
Reeder et al. (2017) in **Journal of Memory and Language.**
]

.pull-right[
Elazar et al. (2022) in **Cognitive Science.**

<br>

Harrigan et al. (2022) in **Language.**
]

---

.pull-left[
## The mistake
]
.pull-right[
## How you'll avoid it
]

--

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as continuous numeric.
]
.pull-right[
]

--

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]
.pull-right[
]

--

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists in the real world.
]
.pull-right[
]

---

### The data: Students' anxiety ratings for "Going to ask my statistics teacher for individual help with material I am having difficulty understanding."

--

<img src="data:image/png;base64,#demo_files/figure-html/bar-aggregated-1.png" width="576" style="display: block; margin: auto;" />

---

### The data: Students' anxiety ratings for "Going to ask my statistics teacher for individual help with material I am having difficulty understanding."

.pull-left[

``` r
slice(anx, 45:50)
```

```
## # A tibble: 6 × 3
##   unique_id gender         rating
##   <chr>     <chr>           <dbl>
## 1 7d28c303  Female/Woman        4
## 2 7d55383a  Another Gender      4
## 3 8116550a  Female/Woman        1
## 4 83491ff9  Female/Woman        4
## 5 8450f8ad  Male/Man            2
## 6 876547d6  Female/Woman        3
```

]

--

.pull-right[

`rating` looks like numbers, and R treats it like numbers, as `dbl`. So it's tempting to manipulate it like numbers.

``` r
mean(anx$rating)
```

```
## [1] 2.868054
```

]

---

## Why Likert scale ratings are not continuous numeric

.center[
]

---
count:false

## Why Likert scale ratings are not continuous numeric

.center[
]

---
count:false

## Why Likert scale ratings are not continuous numeric

.center[
]

---
count:false

## Why Likert scale ratings are not continuous numeric

.center[
]

---

## Remember: We are smarter than R is

Store categorical variables as factors.

``` r
anx <- anx |>
  mutate(rating = factor(rating))
```

--

Now it's impossible to incorrectly treat them as if they're numeric!

``` r
mean(anx$rating)
```

```
## [1] NA
```

---

.pull-left[
## The mistake
]
.pull-right[
## How you'll avoid it
]

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]
.pull-right[
]

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]
.pull-right[
]

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists in the real world.
]
.pull-right[
]

---
count:false

.pull-left[
## The mistake
]
.pull-right[
## How you'll avoid it
]

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]
.pull-right[
When you know a variable is categorical, tell R that using `factor()`.
]

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]
.pull-right[
]

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists in the real world.
]
.pull-right[
]

---

## Model ordinal data with `polr()`

<!-- polr = **P**roportional **O**dds **L**ogistic **R**egression -->

--

``` r
library(MASS)  # MASS contains the polr() function

anx_fit1 <- polr(
  rating ~ 1,        # intercept-only model, to start
  data = anx,
  Hess = TRUE,
  method = 'probit'  # ask me in the Q+A!
)
```

---

## Model ordinal data with `polr()`

``` r
summary(anx_fit1)
```

```
## Call:
## polr(formula = rating ~ 1, data = anx, Hess = TRUE, method = "probit")
##
## No coefficients
##
## Intercepts:
##     Value    Std. Error t value
## 1|2  -0.8420   0.0157   -53.7268
## 2|3  -0.1678   0.0138   -12.1462
## 3|4   0.3833   0.0141    27.1512
## 4|5   1.0339   0.0168    61.6193
##
## Residual Deviance: 26596.28
## AIC: 26604.28
```

---

## What do those `Intercepts` mean?

--

<img src="data:image/png;base64,#demo_files/figure-html/plot-underlying-normal-1.png" width="864" style="display: block; margin: auto;" />

???

- imagine that there's some underlying continuous normal distribution of anxiety, assumed standard normal [show normal distrib]
- ppl with high anxiety are more likely to give high responses, ppl with low anxiety more likely to give low responses (could do emojis relating to anxiety)
- so to estimate how different anxiety levels translate to different responses on the 1--5 scale, we draw thresholds on that distribution [add thresholds]
- ppl with anxiety in this bin will respond with 1, in this bin with 2, etc.
- and those thresholds, the cutpoints btwn ratings, are the intercepts.
- [show intercept estimates, put those same numbers on the thresholds]
- normal distribution assumption is from method = probit. other methods assume other underlying distributions, but the idea of thresholds is the same.

---
count: false

## What do those `Intercepts` mean?

<img src="data:image/png;base64,#demo_files/figure-html/plot-underlying-normal2-1.png" width="864" style="display: block; margin: auto;" />

---
count: false

## What do those `Intercepts` mean?

<img src="data:image/png;base64,#demo_files/figure-html/plot-underlying-normal3-1.png" width="864" style="display: block; margin: auto;" />

---

#### How does a student's gender affect how they respond to "Going to ask my statistics teacher for individual help with material I am having difficulty understanding"?

--

.pull-left[
<img src="data:image/png;base64,#demo_files/figure-html/plot-gender-bars-1.png" width="504" style="display: block; margin: auto;" />
]

--

.pull-right[
<img src="data:image/png;base64,#demo_files/figure-html/normals-stacked-onlyfem-1.png" width="504" style="display: block; margin: auto;" />
]

---

.center[
<img src="data:image/png;base64,#demo_files/figure-html/fem-normal-solo-1.png" width="936" style="display: block; margin: auto;" />
]

---
count: false

<img src="data:image/png;base64,#demo_files/figure-html/fem-mal-normals-1.png" width="936" style="display: block; margin: auto;" />

---
count: false

<img src="data:image/png;base64,#demo_files/figure-html/all-gender-normals-1.png" width="936" style="display: block; margin: auto;" />

---

.pull-left[
## The mistake
]
.pull-right[
## How you'll avoid it
]

<!-- -->

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]
.pull-right[
When you know a variable is categorical, tell R that using `factor()`.
]

<!-- -->

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]
.pull-right[
]

<!-- -->

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists in the real world.
]
.pull-right[
]

---
count: false

.pull-left[
## The mistake
]
.pull-right[
## How you'll avoid it
]

<!-- -->

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]
.pull-right[
When you know a variable is categorical, tell R that using `factor()`.
]

<!-- -->

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]
.pull-right[
Apply and interpret ordinal regression models (e.g., `polr()` from `MASS`).
]

<!-- -->

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists in the real world.
]
.pull-right[
]

---

## Are the effects of `gender` significant?

```
## Coefficients:
##                        Value Std. Error t value
## genderMale/Man       -0.3280    0.03015 -10.880
## genderAnother Gender  0.4846    0.11992   4.041
```

No *p*-values in the model summary.

--

But it's common practice to compare these *t*-values to a standard normal distribution.

--

<img src="data:image/png;base64,#demo_files/figure-html/zscore-mm-1.png" width="648" style="display: block; margin: auto;" />

--

<img src="data:image/png;base64,#demo_files/figure-html/zscore-ag-1.png" width="648" style="display: block; margin: auto;" />

???

Since both *p*-values are below 0.05:

- we CAN reject the null hypothesis that gender has no effect on ratings.
- **we CANNOT conclude that there really is an effect of gender.**

---

### Why don't significant *p*-values mean an effect exists?

--

Because we can also get significant *p*-values when there really is *no* effect.

--

.pull-left[
No difference in the true population:

<img src="data:image/png;base64,#demo_files/figure-html/true-skew-probdist-1.png" width="504" style="display: block; margin: auto;" />
]

--

.pull-right[
A possible random sample (*n* = 50 per group):

<img src="data:image/png;base64,#demo_files/figure-html/simdat-1.png" width="504" style="display: block; margin: auto;" />
]

---

### Why don't significant *p*-values mean an effect exists?

``` r
sim_fit <- polr(rating ~ group, data = simdat, method = 'probit', Hess = TRUE)
summary(sim_fit)
```

```
## Coefficients:
##                Value Std. Error t value
## groupGroup B -0.4479     0.2229  -2.009
```

<br>

--

<img src="data:image/png;base64,#demo_files/figure-html/zscore-sim-1.png" width="648" style="display: block; margin: auto;" />

<br>

So *p* is significant, but in the true population, Group A and Group B were identical!
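---
count: false

### Why don't significant *p*-values mean an effect exists?

A minimal sketch of how often this can happen, assuming freshly simulated data rather than the `simdat` used on the previous slide. Both groups are drawn from one shared rating distribution, a probit `polr()` model is fitted to each simulated study, and the *t*-value is compared to a standard normal distribution as before. The probabilities in `true_probs`, the helper `sim_p_value()`, the seed, and the number of repetitions are all invented for illustration.

```r
library(MASS)  # provides polr()

set.seed(1)  # arbitrary seed, for reproducibility only

# Assumed population: one skewed rating distribution shared by both groups
# (these probabilities are made up for illustration)
true_probs <- c(0.35, 0.25, 0.20, 0.12, 0.08)

# Simulate one study with n ratings per group; return the two-sided p-value
sim_p_value <- function(n = 50) {
  simdat <- data.frame(
    group = factor(rep(c("Group A", "Group B"), each = n)),
    # levels are taken from the observed ratings, so no level is empty
    rating = factor(
      sample(1:5, size = 2 * n, replace = TRUE, prob = true_probs),
      ordered = TRUE
    )
  )
  fit <- polr(rating ~ group, data = simdat, method = "probit", Hess = TRUE)
  t_value <- coef(summary(fit))["groupGroup B", "t value"]
  2 * pnorm(-abs(t_value))  # compare the t-value to a standard normal
}

# Repeat the "study" many times in a world where no effect exists
p_values <- replicate(200, sim_p_value())

# Share of simulated studies with p < 0.05
mean(p_values < 0.05)
```

With enough repetitions, this share should settle somewhere near 0.05: roughly one in twenty studies of a population with no effect would still come out significant.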
---

.pull-left[
## The mistake
]
.pull-right[
## How you'll avoid it
]

<!-- -->

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]
.pull-right[
When you know a variable is categorical, tell R that using `factor()`.
]

<!-- -->

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]
.pull-right[
Apply and interpret ordinal regression models (e.g., `polr()` from `MASS`).
]

<!-- -->

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists in the real world.
]
.pull-right[
]

---
count:false

.pull-left[
## The mistake
]
.pull-right[
## How you'll avoid it
]

<!-- -->

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]
.pull-right[
When you know a variable is categorical, tell R that using `factor()`.
]

<!-- -->

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]
.pull-right[
Apply and interpret ordinal regression models (e.g., `polr()` from `MASS`).
]

<!-- -->

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists in the real world.
]
.pull-right[
Understand that significant *p*-values can arise even if no effect exists in the real world.
]

---

<img src="data:image/png;base64,#demo_files/figure-html/anx-normal-end1-1.png" width="864" style="display: block; margin: auto;" />

---
count:false

<img src="data:image/png;base64,#demo_files/figure-html/anx-normal-end2-1.png" width="864" style="display: block; margin: auto;" />

---

.pull-left[
## The mistake
]
.pull-right[
## How you'll avoid it
]

<!-- -->

.pull-left[
**A common R mistake:** Letting R treat all variables that consist of numbers as numeric.
]
.pull-right[
When you know a variable is categorical, tell R that using `factor()`.
]

<!-- -->

.pull-left[
**An advanced stats mistake:** Modelling categorical, ordinal data as if it were numeric.
]
.pull-right[
Apply and interpret ordinal regression models (e.g., `polr()` from `MASS`).
]

<!-- -->

.pull-left[
**A foundational stats mistake:** Interpreting a significant *p*-value as evidence that an effect exists in the real world.
]
.pull-right[
Understand that significant *p*-values can arise even if no effect exists in the real world.
]

--

.center[**Thank you!
Time for questions!**]

---
count: false

## References

Elazar, A., Alhama, R. G., Bogaerts, L., Siegelman, N., Baus, C., & Frost, R. (2022). When the "tabula" is anything but "rasa": What determines performance in the auditory statistical learning task? *Cognitive Science*, 46(2), e13102.

Harrigan, K., Hogoboom, A., & Cochrane, L. (2022). Furthering student engagement: Lab sections in introductory linguistics. *Language*, 98(4), e199–e223.

Reeder, P. A., Newport, E. L., & Aslin, R. N. (2017). Distributional learning of subcategories in an artificial grammar: Category generalization and subcategory restrictions. *Journal of Memory and Language*, 97, 17–29.

Terry, J., Ross, R. M., Nagy, T., Salgado, M., Garrido-Vásquez, P., Sarfo, J. O., Cooper, S., Buttner, A. C., Lima, T. J. S., Öztürk, İ., Akay, N., Santos, F. H., Artemenko, C., Copping, L. T., Elsherif, M. M., Milovanović, I., Cribbie, R. A., Drushlyak, M. G., Swainston, K., … Field, A. P. (2023). Data from an International Multi-Centre Study of Statistics and Mathematics Anxieties and Related Variables in University Students (the SMARVUS Dataset). *Journal of Open Psychology Data*, 11(1), 8.
---
count: false

## Some really nice resources

- Jamieson's (2004) paper **[Likert scales: How to (ab)use them.](https://onlinelibrary.wiley.com/doi/10.1111/j.1365-2929.2004.02012.x)**
- UCLA Statistical Methods and Data Analytics's web page **[Ordinal Logistic Regression.](https://stats.oarc.ucla.edu/r/dae/ordinal-logistic-regression/)**
- Kurz' (2021) blog post **[Notes on the Bayesian cumulative probit.](https://stats.oarc.ucla.edu/r/dae/ordinal-logistic-regression/)**
- Vasishth and Nicenboim's (2016) paper **[Statistical Methods for Linguistic Research: Foundational Ideas – Part I.](https://doi.org/10.1111/lnc3.12201)**
- Gelman and Hill's (2007) book **[Data Analysis Using Regression and Multilevel/Hierarchical Models.](https://www.cambridge.org/highereducation/books/data-analysis-using-regression-and-multilevel-hierarchical-models/32A29531C7FD730C3A68951A17C9D983)**
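
---
count: false

## What do those `Intercepts` mean?

A backup sketch for the Q+A, assuming the probit link used in `anx_fit1`: the thresholds can be turned back into predicted response proportions by taking areas under the standard normal curve. The four values are copied from the `Intercepts` table in the model summary; the object names `thresholds`, `cum_probs`, and `probs` are made up for illustration.

```r
# Intercepts (cutpoints) copied from summary(anx_fit1)
thresholds <- c(`1|2` = -0.8420, `2|3` = -0.1678, `3|4` = 0.3833, `4|5` = 1.0339)

# Area under the standard normal curve to the left of each cutpoint:
# the cumulative probability of responding at or below each rating
cum_probs <- pnorm(thresholds)

# Differences between neighbouring cumulative probabilities give
# the probability of each individual rating, 1 to 5
probs <- diff(c(0, cum_probs, 1))
names(probs) <- 1:5

round(probs, 3)
```

Because the model is intercept-only, these proportions should closely match the observed share of each rating in the data.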